AITopics | source environment

Collaborating Authors

source environment

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

28f248e9279ac845995c4e9f8af35c2b-Paper.pdf

Neural Information Processing SystemsFeb-7-2026, 21:34:52 GMT

action transformation, source environment, target environment, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.05)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > Canada (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)

Add feedback

f187a23c3ee681ef6913f31fd6d6446b-Paper.pdf

Neural Information Processing SystemsAug-18-2025, 19:08:41 GMT

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Baden-Württemberg > Freiburg (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)

Add feedback

Hybrid Cross-domain Robust Reinforcement Learning

Van, Linh Le Pham, Nguyen, Minh Hoang, Le, Hung, Tran, Hung The, Gupta, Sunil

arXiv.org Artificial IntelligenceMay-30-2025

Robust reinforcement learning (RL) aims to learn policies that remain effective despite uncertainties in its environment, which frequently arise in real-world applications due to variations in environment dynamics. The robust RL methods learn a robust policy by maximizing value under the worst-case models within a predefined uncertainty set. Offline robust RL algorithms are particularly promising in scenarios where only a fixed dataset is available and new data cannot be collected. However, these approaches often require extensive offline data, and gathering such datasets for specific tasks in specific environments can be both costly and time-consuming. Using an imperfect simulator offers a faster, cheaper, and safer way to collect data for training, but it can suffer from dynamics mismatch. In this paper, we introduce HYDRO, the first Hybrid Cross-Domain Robust RL framework designed to address these challenges. HYDRO utilizes an online simulator to complement the limited amount of offline datasets in the non-trivial context of robust RL. By measuring and minimizing performance gaps between the simulator and the worst-case models in the uncertainty set, HYDRO employs novel uncertainty filtering and prioritized sampling to select the most relevant and reliable simulator samples. Our extensive experiments demonstrate HYDRO's superior performance over existing methods across various tasks, underscoring its potential to improve sample efficiency in offline robust RL.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2505.23003

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Inversely Learning Transferable Rewards via Abstracted States

Gui, Yikang, Doshi, Prashant

arXiv.org Artificial IntelligenceJan-3-2025

Inverse reinforcement learning (IRL) has progressed significantly toward accurately learning the underlying rewards in both discrete and continuous domains from behavior data. The next advance is to learn {\em intrinsic} preferences in ways that produce useful behavior in settings or tasks which are different but aligned with the observed ones. In the context of robotic applications, this helps integrate robots into processing lines involving new tasks (with shared intrinsic preferences) without programming from scratch. We introduce a method to inversely learn an abstract reward function from behavior trajectories in two or more differing instances of a domain. The abstract reward function is then used to learn task behavior in another separate instance of the domain. This step offers evidence of its transferability and validates its correctness. We evaluate the method on trajectories in tasks from multiple domains in OpenAI's Gym testbed and AssistiveGym and show that the learned abstract reward functions can successfully learn task behaviors in instances of the respective domains, which have not been seen previously.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2501.01669

Country:

North America > United States > Michigan > Wayne County > Detroit (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Add feedback

On Reward Transferability in Adversarial Inverse Reinforcement Learning: Insights from Random Matrix Theory

Zhang, Yangchun, Zhou, Wang, Zhou, Yirui

arXiv.org Machine LearningDec-30-2024

In the context of inverse reinforcement learning (IRL) with a single expert, adversarial inverse reinforcement learning (AIRL) serves as a foundational approach to providing comprehensive and transferable task descriptions. However, AIRL faces practical performance challenges, primarily stemming from the framework's overly idealized decomposability condition, the unclear proof regarding the potential equilibrium in reward recovery, or questionable robustness in high-dimensional environments. This paper revisits AIRL in \textbf{high-dimensional scenarios where the state space tends to infinity}. Specifically, we first establish a necessary and sufficient condition for reward transferability by examining the rank of the matrix derived from subtracting the identity matrix from the transition matrix. Furthermore, leveraging random matrix theory, we analyze the spectral distribution of this matrix, demonstrating that our rank criterion holds with high probability even when the transition matrices are unobservable. This suggests that the limitations on transfer are not inherent to the AIRL framework itself, but are instead related to the training variance of the reinforcement learning algorithms employed within it. Based on this insight, we propose a hybrid framework that integrates on-policy proximal policy optimization in the source environment with off-policy soft actor-critic in the target environment, leading to significant improvements in reward transfer effectiveness.

singular value, source environment, target environment, (14 more...)

arXiv.org Machine Learning

2410.07643

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Russia (0.04)
(3 more...)

Genre: Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

TEA: Trajectory Encoding Augmentation for Robust and Transferable Policies in Offline Reinforcement Learning

Ormancı, Batıkan Bora, Swazinna, Phillip, Udluft, Steffen, Runkler, Thomas A.

arXiv.org Artificial IntelligenceNov-28-2024

In this paper, we investigate offline reinforcement learning (RL) with the goal of training a single robust policy that generalizes effectively across environments with unseen dynamics. We propose a novel approach, Trajectory Encoding Augmentation (TEA), which extends the state space by integrating latent representations of environmental dynamics obtained from sequence encoders, such as AutoEncoders. Our findings show that incorporating these encodings with TEA improves the transferability of a single policy to novel environments with new dynamics, surpassing methods that rely solely on unmodified states. These results indicate that TEA captures critical, environment-specific characteristics, enabling RL agents to generalize effectively across dynamic conditions.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

2411.19133

Country: Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)

Genre: Research Report > New Finding (0.88)

Industry: Transportation (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Return Augmented Decision Transformer for Off-Dynamics Reinforcement Learning

Wang, Ruhan, Yang, Yu, Liu, Zhishuai, Zhou, Dongruo, Xu, Pan

arXiv.org Machine LearningOct-30-2024

We study offline off-dynamics reinforcement learning (RL) to utilize data from an easily accessible source domain to enhance policy learning in a target domain with limited data. Our approach centers on return-conditioned supervised learning (RCSL), particularly focusing on the decision transformer (DT), which can predict actions conditioned on desired return guidance and complete trajectory history. Previous works tackle the dynamics shift problem by augmenting the reward in the trajectory from the source domain to match the optimal trajectory in the target domain. However, this strategy can not be directly applicable in RCSL owing to (1) the unique form of the RCSL policy class, which explicitly depends on the return, and (2) the absence of a straightforward representation of the optimal trajectory distribution. We propose the Return Augmented Decision Transformer (RADT) method, where we augment the return in the source domain by aligning its distribution with that in the target domain. We provide the theoretical analysis demonstrating that the RCSL policy learned from RADT achieves the same level of suboptimality as would be obtained without a dynamics shift. We introduce two practical implementations RADT-DARA and RADT-MV respectively. Extensive experiments conducted on D4RL datasets reveal that our methods generally outperform dynamic programming based methods in off-dynamics RL scenarios.

algorithm, dataset, target environment, (15 more...)

arXiv.org Machine Learning

2410.2345

Country: North America > United States > Indiana (0.04)

Genre: Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Robust prediction under missingness shifts

Rockenschaub, Patrick, Xian, Zhicong, Zamanian, Alireza, Piperno, Marta, Ciora, Octavia-Andreea, Pachl, Elisabeth, Ahmidi, Narges

arXiv.org Machine LearningJun-24-2024

Prediction becomes more challenging with missing covariates. What method is chosen to handle missingness can greatly affect how models perform. In many real-world problems, the best prediction performance is achieved by models that can leverage the informative nature of a value being missing. Yet, the reasons why a covariate goes missing can change once a model is deployed in practice. If such a missingness shift occurs, the conditional probability of a value being missing differs in the target data. Prediction performance in the source data may no longer be a good selection criterion, and approaches that do not rely on informative missingness may be preferable. However, we show that the Bayes predictor remains unchanged by ignorable shifts for which the probability of missingness only depends on observed data. Any consistent estimator of the Bayes predictor may therefore result in robust prediction under those conditions, although we show empirically that different methods appear robust to different types of shifts. If the missingness shift is non-ignorable, the Bayes predictor may change due to the shift. While neither approach recovers the Bayes predictor in this case, we found empirically that disregarding missingness was most beneficial when it was highly informative.

missingness, prediction, target environment, (13 more...)

arXiv.org Machine Learning

2406.16484

Country:

North America > United States (0.28)
Oceania > Guam (0.04)
North America > Puerto Rico (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Public Health (0.93)
Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (0.67)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Multi-Agent Transfer Learning via Temporal Contrastive Learning

Zeng, Weihao, Campbell, Joseph, Stepputtis, Simon, Sycara, Katia

arXiv.org Artificial IntelligenceJun-3-2024

This paper introduces a novel transfer learning framework for deep multi-agent reinforcement learning. The approach automatically combines goal-conditioned policies with temporal contrastive learning to discover meaningful sub-goals. The approach involves pre-training a goal-conditioned agent, finetuning it on the target domain, and using contrastive learning to construct a planning graph that guides the agent via sub-goals. Experiments on multi-agent coordination Overcooked tasks demonstrate improved sample efficiency, the ability to solve sparse-reward and long-horizon problems, and enhanced interpretability compared to baselines. The results highlight the effectiveness of integrating goal-conditioned policies with unsupervised temporal abstraction learning for complex multi-agent transfer learning. Compared to state-of-the-art baselines, our method achieves the same or better performances while requiring only 21.7% of the training samples.

agent, learning, target environment, (12 more...)

arXiv.org Artificial Intelligence

2406.01377

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback